Preamble. Data collection for life cycle inventories (LCI) is still
a critical factor for successful work in the area of Life Cycle
Assessment (LCA). To facilitate reliable LCI data collection, the
SETAC-Europe LCA Working Group 'Data Availability and Data
Quality' was formed in April 1998. The goal of the Working
Group is to focus on the key features of improving the efficiency
and quality of data collection (Fig. 1).


Fig. 1: The key features of data collection

A series of five articles presenting the activities of the working
group and its five subgroups through the past three years is being
published in Int J LCA. This issue provides the first three
subgroup papers: A description of how to assess data quality is
presented in the article 'Framework for Modelling Data Uncertainty
in Life Cycle Inventories' (Huijbregts et al. 2001). The second
article focuses on the driving forces for data exchange (Van Hoof
et al. 2001). The third article deals with the availability and
quality of energy, transport and waste models (Braam et al. 2001).


Two forthcoming papers concern recommendations for a standardized
list of environmental interventions (Hischier et al. 2001)
and interfaces to existing software (Jean et al., in preparation).

On the occasion of the 11th Annual Meeting of SETAC-Europe
in Madrid, Spain, 6-10 May 2001, the working group met
for its last session to conclude its work. The full report of the
SETAC Working Group 'Data Availability and Data Quality'
will be published after external peer review as a 'Code of Life
Cycle Inventory Practice' by SETAC (De Beaufort & Bretz 2001).
The results are fed into the ISO 14048 finalisation process and
lay a foundation for the newly starting LCI subprogram of the
UNEP-LCI Life Cycle Initiative. Thus, Madrid has become an
important step for the future advancement of LCI data
availability and data quality.
Abstract. Modelling data uncertainty is not common practice in
life cycle inventories (LCI), although different techniques are
available for estimating and expressing uncertainties, and for
propagating the uncertainties to the final model results. To clarify
and stimulate the use of data uncertainty assessments in common LCI
practice, the SETAC working group 'Data Availability and Quality'
presents a framework for data uncertainty assessment in LCI.
Data uncertainty is divided into two categories: (1) lack of data,
further specified as complete lack of data (data gaps) and a lack
of representative data, and (2) data inaccuracy. Data gaps can be
filled by input-output modelling, by using information for
similar products or the main ingredients of a product, and by
applying the law of mass conservation. Lack of temporal,
geographical and further technological correlation between the data
used and the data needed may be accounted for by applying
uncertainty factors to the non-representative data. Stochastic
modelling, which can be performed by Monte Carlo simulation, is a
promising technique to deal with data inaccuracy in LCIs.


Keywords: Data gaps; data inaccuracy; data uncertainty;
unrepresentative data; general framework; life cycle inventory
(LCI); Monte Carlo simulation; sensitivity analysis; SETAC LCA-WG,
Data Availability and Data Quality; uncertainty assessment;
uncertainty importance


Introduction 

A major problem affecting the application of LCA-based
tools concerns the data about resource use and emissions
that are used in life cycle inventories (Fava et al. 1994,
Weidema and Wesnaes 1996). Actual monitoring of resource
use and emissions is not common practice, many producers
do not provide the data necessary for LCAs, and where data are
publicly available, these tend to be dated and may not
reflect current technologies. It is important to know to what
extent the outcome of an LCA is affected by data uncertainty
in the inventory, as this may help decision makers in judging
the significance of the differences in product comparisons,
options for product improvements or the assignment of ecolabels.

Here, data uncertainty is divided into lack of data and data
inaccuracy. Lack of data is further specified as a complete
lack of data (data gaps) and a lack of representative data
for the product system(s) under study (Fig. 1). How to handle
these types of data uncertainty is described below.



Fig. 1: Division of different types of data uncertainty
(figure labels: data gaps; unrepresentative data)
1 Lack of Data 
1.1 Data gaps 

There may be gaps in data concerning (i) flows between economic
processes and (ii) interventions with the environment.
Generally, missing information in LCIs is implicitly set to
zero. Errors introduced by such omissions cause a systematic
bias towards lower values. Ignorance is thereby rewarded:
a comparison between a well-documented process
and its less completely analyzed counterpart will be biased
towards favoring the latter, a very undesirable result of an
LCI. Obviously, the lack of (specified) data on inter-process
flows and environmental interventions should be prevented
as much as possible. In the (near) future, the recommendations
on a standard inventory list, including guidelines for
reporting sum parameters, may prevent the occurrence of data
gaps to some extent (Hischier et al. 2001). Nevertheless,
dealing with data gaps in current data sets will still be
necessary. Possibilities to do this are given below.


1.1.1 Inter-process flows 

If a process system is modelled for the collection of LCI
data, all known process steps of significance should be
considered, with their known or estimated amounts entering or
leaving the main system. If life cycle inventory data for a
production process are impossible to obtain, input-output
models, such as the Economic Input Output-Life Cycle Assessment
(Carnegie Mellon University 1999, Hendrickson et al.
1998), LCNetBase (Norris 1997) and the Missing Inventory
Estimation Tool (Suh 2000), may be used instead. Using the
estimated price of missing flows as input, direct and indirect
environmental emissions and resources can be calculated with
these models. The input-output model approach should be
seen as an estimation method for lacking data, as there are
important limitations involved related to the use of current
input-output models in LCA case studies (Carnegie Mellon
University 1999, Suh 2000). One drawback of the available
software is that it only covers the United States, while
information representative for other regions may be needed in the
processes under study. Furthermore, while the input-output
models strive to include comprehensive data, some sources
are incomplete. For instance, capital goods are not counted as
inputs in the input-output convention, toxic emissions are not
reported for all industrial sectors, and some impacts such as
land use and habitat destruction are not included. Another
problem is the level of aggregation. Even with 500 sectors,
more detailed information on particular products or processes
is in most cases necessary in LCIs.
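
The input-output estimate described above can be illustrated with a minimal sketch. It is not the algorithm of any of the cited tools; it only assumes the generic input-output relation in which the total sector output x satisfies x = y + A x for a final demand y, and emissions are obtained by multiplying x with direct emission intensities. The two-sector matrix `A`, the vector `intensity` and the function names are hypothetical and chosen for illustration only.

```python
def total_output(A, y, tol=1e-12, max_iter=10_000):
    """Solve x = y + A x by fixed-point iteration.

    This sums the power series y + A y + A^2 y + ..., which converges
    when the inter-sector requirements in A are small enough.
    """
    n = len(y)
    x = list(y)
    for _ in range(max_iter):
        x_new = [y[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge")


# Hypothetical two-sector economy (all numbers invented for illustration).
# A[i][j]: purchases from sector i per currency unit of output of sector j.
A = [[0.10, 0.20],
     [0.30, 0.05]]
# Direct emissions per currency unit of sector output (kg per currency unit).
intensity = [0.8, 0.2]


def estimate_emissions(price, sector):
    """Direct plus indirect emissions (kg) for a missing flow bought at `price`."""
    y = [0.0] * len(A)
    y[sector] = price          # the missing flow enters as final demand
    x = total_output(A, y)     # output of every sector needed to supply it
    return sum(r * xi for r, xi in zip(intensity, x))


missing_flow_emission = estimate_emissions(price=100.0, sector=1)
direct_only = intensity[1] * 100.0
```

In this toy case the estimate for the missing flow exceeds the direct emissions of the supplying sector alone, because the upstream purchases from the other sector are included.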

Inter-process flows for which data are lacking can also be
estimated by using information for the most similar 'analogous'
process or product for which data are available. Similar flows
with respect to chemical, physical or other comparable
properties for which process data are known may be used for this
purpose. For example, missing data for a pigment additive in
LDPE may be estimated by the process "production of inorganic
chemicals" in the ETH-database (Frischknecht et al.
1996). Another option in this respect is to replace small
quantity additives not with their closest analog, but with the main
ingredients of the product. For instance, Boustead (1997)
replaces additives in polyurethane products by their main
ingredients, TDI, MDI and polyol. Expert judgements about 'what
is reasonably similar' or 'what can reasonably be replaced by
the main ingredients of the product' will differ from
practitioner to practitioner, introducing a degree of arbitrariness
to the results of these two approaches.
1.1.2 Environmental interventions

Identifying data gaps in environmental interventions, such
as pollutant flows and mineral extractions per unit of process
output, may be more complicated than finding data gaps
in inter-process flows; on the other hand, if some information
on the inputs and outputs is available, it can often be
inferred that adequate book-keeping of raw materials
and energies and appropriate analytics of effluents have
been performed (Bretz and Frankhouser 1996). An inspection
of the parameter lists found in literature sources for
LCIs often reveals obvious gaps. As with data on inter-process
flows, more complete accounting for environmental
interventions may also be obtained from closest analog
processes. For example, the inventory for coal-fired steam
boilers in Habersatter (1991) shows no heavy metal emissions.
The data user can try to fill the gaps with data from other
sources (e.g. Habersatter and Fecker 1998). Other sources
may also be used to further specify sum parameters, such as
'hydrocarbon emissions', in more detail. For instance,
hydrocarbon emissions related to polypropylene production
(Boustead 1993) may be further specified by more detailed
information from Frischknecht et al. (1996).

Furthermore, a comparison between the process inputs and
stoichiometry (e.g. of heavy metals, CFCs etc. going into
the process), and the composition of the intermediate good,
often reveals gaps in the output parameter list. The law of
mass conservation frequently helps to fill data gaps here.
Known constituents of a product must appear somewhere
on the input list, and known inputs that do not remain in
the final product must leave the process as waste. A major
drawback of mass balance calculations may be the
underestimation of relative losses with high environmental impacts.
There may also be a failure in identifying the final fate of
the wastes or the environmental media to which the emissions
are released. Nevertheless, using a mass balance is
clearly superior to a total disregard of the mass balance.
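
As a minimal sketch of the mass balance argument, the amount of a substance that enters a process but is found neither in the product nor in the reported releases can be computed as a residual. The function name, the example substance flows and the tolerance are assumptions made for illustration; the sketch says nothing about the medium to which the residual is released, which is the limitation noted above.

```python
def unaccounted_output(input_mass, mass_in_product, reported_releases, tol=1e-9):
    """Return the mass (kg) not explained by the product or reported releases.

    input_mass:        mass of the substance entering the process (kg)
    mass_in_product:   mass of the substance remaining in the product (kg)
    reported_releases: dict mapping a release route to a reported mass (kg)
    """
    gap = input_mass - mass_in_product - sum(reported_releases.values())
    if gap < -tol:
        # Outputs exceed inputs: the input list itself is incomplete.
        raise ValueError("reported outputs exceed inputs; check the input list")
    return max(gap, 0.0)


# Hypothetical heavy metal balance for one unit process (invented numbers).
gap_kg = unaccounted_output(
    input_mass=2.0,
    mass_in_product=1.5,
    reported_releases={"air": 0.1, "water": 0.1},
)
```

A positive residual indicates a gap in the output parameter list; a negative one points to a missing input.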

1.2 Lack of representative data 

Not only a complete lack of data, but also a lack of data that
are representative of the product system under study may lead
to unreliable results. Temporal, geographical and further
technological correlation between the data used and data needed
should be considered (Table 1, Weidema 1998). According to
Weidema (1998), "the indicator 'temporal correlation' expresses
the degree of accordance between the year of the study and the
year of collection of the obtained data. The indicator
'geographical correlation' expresses the degree of accordance
between the production conditions in the area relevant for the
study and in the geographical area covered by the data obtained.
The indicator 'further technological correlation' concerns all
other aspects of correlation than the temporal and geographical
considerations." As can be derived from the definitions of
the three data quality indicators, they are related to the
conditions under which the data are valid, and therefore dependent
on the goals of the study in which the data are applied.
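
The temporal correlation indicator of Table 1 can be written as a small lookup, shown here as a sketch. The thresholds are those listed in Table 1; the function name is hypothetical, and assigning a difference of exactly 15 years to score 5 is an assumption, since the table does not state how that boundary is handled.

```python
def temporal_correlation_score(study_year, data_year=None):
    """Return the Table 1 indicator score (1 to 5) for temporal correlation.

    data_year=None represents data of unknown age, which scores 5.
    """
    if data_year is None:
        return 5
    difference = abs(study_year - data_year)
    if difference < 3:
        return 1
    if difference < 6:
        return 2
    if difference < 10:
        return 3
    if difference < 15:
        return 4
    return 5  # assumption: exactly 15 years is treated like "more than 15"


score_recent = temporal_correlation_score(1995, 1994)
score_unknown = temporal_correlation_score(1995)
```

The score is an identification number only and carries no quantitative meaning by itself.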

It should be stressed that the scores in Table 1 serve as
identification numbers only, and should not be interpreted as a
certain quantitative representation of data representativeness
(Weidema 1998). Although a quantitative assessment
of the uncertainty related to the use of unrepresentative data
within an LCI study may be preferable, it is also extremely
difficult. For instance, imagine quantifying the uncertainty
arising from using data for Canadian car manufacturing in
1980, while one needs data for tractor manufacturing in the
US for the year 1995.

A procedure to quantitatively deal with unrepresentative data
in LCIs, consistent with the notion of systematic bias, may
be very valuable. In this respect, it may be useful to include
so-called 'uncertainty factors' for non-representative data
in LCIs. An uncertainty factor may be introduced for each
data quality indicator (Equation 1), where the actual
uncertainty factor may depend on the temporal, geographical and
technical distances between the modelled process and the
data used. Quantitative estimates of these uncertainty factors
may be obtained through empirical analysis of time series
and cross-sectional data on process inputs and releases
per unit of output. Note that the introduction of uncertainty
factors in LCIs is analogous to the application of 'assessment
factors' in the derivation of environmental quality
objectives for ecological risk assessment (EC 1996).

$$E'_{x,k} = UF_{t,k} \times UF_{g,k} \times UF_{ft,k} \times E_{x,k} \qquad (1)$$

in which $E'_{x,k}$ is the corrected emission of substance x per
unit process k (kg); $UF_{t,k}$ is the uncertainty factor
representing the temporal correlation between the data used and
needed for unit process k (dimensionless); $UF_{g,k}$ is the
uncertainty factor representing the geographical correlation
between the data used and needed for unit process k
(dimensionless); $UF_{ft,k}$ is the uncertainty factor representing
the further technological correlation between the data used and
needed for unit process k (dimensionless); and $E_{x,k}$ is the
initial emission of substance x per unit process k (kg).

Table 1: Pedigree matrix with three data quality indicators (taken from Weidema 1998)

| Indicator score | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Temporal correlation | Less than 3 years of difference to year of study | Less than 6 years difference | Less than 10 years difference | Less than 15 years difference | Age of data unknown or more than 15 years of difference |
| Geographical correlation | Data from area under study | Average data from larger area in which the area under study is included | Data from area with similar production conditions | Data from area with slightly similar production conditions | Data from unknown area or area with very different production conditions |
| Further technological correlation | Data from enterprises, processes and materials under study | Data from processes and materials under study but from different enterprises | Data from processes and materials under study but from different technology | Data on related processes or materials but from same technology | Data on related processes or materials but from different technology |
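
Equation 1 can be illustrated with a short sketch: the corrected emission is the initial emission multiplied by the three uncertainty factors. The function name and the factor values in the example are hypothetical; the paper does not give numerical uncertainty factors, and empirically derived values would be needed in practice.

```python
def corrected_emission(initial_emission, uf_temporal, uf_geographical,
                       uf_technological):
    """Apply Equation 1: multiply the initial emission (kg) by the three
    dimensionless uncertainty factors and return the corrected emission (kg)."""
    for factor in (uf_temporal, uf_geographical, uf_technological):
        if factor <= 0:
            raise ValueError("uncertainty factors must be positive")
    return uf_temporal * uf_geographical * uf_technological * initial_emission


# Hypothetical factors for data that are somewhat dated, from another
# region and from a related technology (values invented for illustration).
e_corrected = corrected_emission(
    initial_emission=10.0,   # kg of substance x per unit process k
    uf_temporal=1.2,
    uf_geographical=1.5,
    uf_technological=2.0,
)
```

With all three factors equal to 1 the data are treated as fully representative and the emission is left unchanged.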

Finally, it should be noted that Table 1 contains only three
of the five data quality indicators identified by Weidema
and Wesnaes (1996) and Weidema (1998). The reason is that
the other two data quality indicators, 'reliability of
the source' and 'completeness', are not related to the issue
of data representativeness. These two indicators are covered
in a more quantitative way in the sections 'Data gaps'
and 'Data inaccuracy' in this paper. However, for a quick
data quality overview the use of the semi-qualitative
indicators 'reliability of the source' and 'completeness' may still
be valuable in LCIs. The reader is referred to Weidema and
Wesnaes (1996) and Weidema (1998) for a comprehensive
overview of the application of these two other indicators in
LCI data quality assessments.

2 Data Inaccuracy 

Data inaccuracy may be caused by imprecise measurement
methods, (expert) estimations and assumptions, measurements
from a small number of sites, and inadequate time
periods of measurements pertinent to the processes involved
(Huijbregts 1998a). Various methods have been proposed
to make data inaccuracy operational in LCA outcomes, such
as analytical uncertainty propagation methods (Hoffman
et al. 1995, Heijungs 1996), calculation with intervals and
fuzzy logic (Chevalier and Le Teno 1996, Becalli et al. 1997),
and stochastic modelling (Kennedy et al. 1996, Huijbregts
1998b, Maurice et al. 2000). In particular, stochastic
modelling, which can be performed by Monte Carlo simulation,
seems to be a promising technique for making data
inaccuracy in LCIs operational, as Monte Carlo simulation
is widely recognised as a valid technique and the level of
mathematics required to perform a Monte Carlo simulation
is quite basic (Vose 1996). An explanation of the steps
involved in assessing LCI data inaccuracy by Monte Carlo
simulation is given below. Fig. 3 shows an overview of the
proposed procedure.

Fig. 3: Scheme for the analysis of data inaccuracy in LCI

First of all, to perform Monte Carlo simulation, parameters
have to be specified as uncertainty distributions. In
practice, however, it will be a very difficult and time-consuming
exercise to characterise the uncertainty ranges for
the enormous number of parameters involved in the inventory
analysis. Selecting and applying one or more appropriate
methods for identifying and assessing the most important
input uncertainties is of vital importance in these
cases (Burmaster and Anderson 1994). It is recommended
to first perform a sensitivity analysis, sometimes called a
perturbation analysis, of the deterministic calculations to
determine the parameters important for the probabilistic
simulations (Burmaster and Anderson 1994, Heijungs
1996). Here, the focus may be on the influence of input
parameters on outcomes of the inventory, characterisation,
normalisation and/or final weightings between impact
categories. If the focus is on the outcomes of the inventory,
inter-process data, environmental interventions, and data
underlying the functional unit should be varied in the
sensitivity analysis. Prices of missing flows, used in input-output
models, and uncertainty factors, representing temporal,
geographical and technical distances between the
modelled process and the data used, should also be varied
in the sensitivity analysis, if this information is used to
compensate for the lack of (representative) inventory data.

In general only a few input parameters will be important to
specify in further detail, because (i) the values of some inputs
account for a dominant fraction of the environmental impact
and/or (ii) the ranges of some inputs account for a dominant
fraction of the range in environmental impact (Heijungs
1996). In this step one may elect to use a single standard
sensitivity range for all parameters (e.g. ±10%), although
this approach may fail to highlight highly inaccurate input
parameters with only modest influence. A rough solution
may be found in the use of a number of wide standard
sensitivity ranges and distribution functions for different types
of environmental interventions and economic processes
(Finnveden and Lindfors 1998, Huijbregts 1998b). A
complementary strategy to simplify the analysis is to implement
sensitivity ranges for accumulated environmental interventions
instead of individual parameters in LCA inventories (see
Kennedy et al. 1996, Huijbregts 1998b). This simplification
may be particularly useful in the analysis of the potential
importance of background data in product assessments. For
example, ranges for accumulated environmental interventions
per 'MJ heat production' or 't km transport' could be
implemented in the sensitivity analysis. Whenever the sensitivity
analysis shows that some of these accumulated environmental
interventions may contribute substantially to the uncertainty
in model outcomes, a reconstruction of parts of the
accumulated inventory will be necessary (Huijbregts 1998b).
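
A sensitivity (perturbation) analysis with a single standard range can be sketched as follows. Each parameter of a deterministic inventory model is varied by plus and minus 10% while the others stay at their baseline values, and the parameters are ranked by the resulting range in the outcome. The toy model, its parameter names and all numbers are hypothetical.

```python
def inventory_result(p):
    """Hypothetical toy inventory: emission (kg) per functional unit."""
    return (p["electricity_MJ"] * p["ef_electricity"]
            + p["transport_tkm"] * p["ef_transport"])


baseline = {
    "electricity_MJ": 50.0,   # MJ per functional unit
    "ef_electricity": 0.15,   # kg per MJ
    "transport_tkm": 20.0,    # t km per functional unit
    "ef_transport": 0.10,     # kg per t km
}


def one_at_a_time(model, params, rel_range=0.10):
    """Vary each parameter by +/- rel_range, one at a time, and return
    the output range per parameter, largest first."""
    ranges = {}
    for name, value in params.items():
        low = model({**params, name: value * (1 - rel_range)})
        high = model({**params, name: value * (1 + rel_range)})
        ranges[name] = abs(high - low)
    return dict(sorted(ranges.items(), key=lambda item: item[1], reverse=True))


ranking = one_at_a_time(inventory_result, baseline)
```

As noted in the text, a uniform range ranks parameters by their influence on the outcome only; a parameter that is far more inaccurate than ±10% but has modest influence would not stand out in such a ranking.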

The next step is specifying the uncertainty distributions of
the potentially important parameters in more detail. A rule
of thumb could be that the parameters which together cover,
for instance, 90% or 95% of the sensitivity range, should
get priority in data refinement efforts. As noted before, a
quantitative uncertainty analysis is generally complicated by
a lack of knowledge about actual uncertainty of data on
inputs to and outputs from industrial processes. One substitute
for lack of knowledge is the use of expert judgement to
estimate uncertainty ranges (Morgan et al. 1985, Morgan
and Henrion 1990, Otway and Winterfeldt 1992, Taylor
1993). In addition, a description in databases of uncertainty
ranges for environmental interventions per unit process
would substantially improve the feasibility of performing a
standard uncertainty analysis in LCAs. A first step in this
respect is the development of a common database format,
in which uncertainty ranges for average life cycle inventory
data should be listed (Singhofen et al. 1996, Weidema 1999).

After specifying the uncertainty distributions for the most
important input parameters, a second sensitivity analysis
should be applied. This second analysis is necessary because
defining parameters in more detail generally results in smaller
ranges, which may alter their importance (Ragas et al. 1999).
Roughly defined input parameters, found important in the
second sensitivity analysis, should also be specified in more
detail. If all important input parameters are specified in
detail, the Monte Carlo simulation can be applied. This method
varies all the parameters at random, but the variation is
restricted by the given uncertainty distribution for each
parameter. The randomly selected values from all the parameter
uncertainty distributions are inserted in the output equation.
Repeated calculations produce a distribution of
the predicted output values, reflecting the combined parameter
uncertainties. A special form of the Monte Carlo technique
is Latin Hypercube simulation, which first segments
the uncertainty distribution of a parameter into a number of
non-overlapping intervals, each having an equal probability,
and then draws a value from each interval.
This helps stabilize the tails of the output as quickly as
possible (Burmaster and Anderson 1994). A model run generally
consists of 10,000 iterations, which may be considered
sufficient to obtain a representative frequency chart of the
output variables (Huijbregts 1998b, Ragas et al. 1999).
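
The Monte Carlo procedure with Latin Hypercube sampling can be sketched with the Python standard library. The sketch assumes lognormal uncertainty distributions described by a median and a geometric standard deviation, which is a choice made here for illustration and not prescribed by the text; the toy output equation, parameter values and function names are likewise hypothetical.

```python
import math
import random
from statistics import NormalDist


def latin_hypercube(n, rng):
    """Return n probabilities, one drawn from each of n non-overlapping,
    equal-probability intervals of (0, 1), in random order."""
    u = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(u)
    return u


def sample_lognormal(median, gsd, n, rng):
    """Latin Hypercube sample of a lognormal distribution given its
    median and geometric standard deviation (gsd > 1)."""
    dist = NormalDist(math.log(median), math.log(gsd))
    eps = 1e-12  # keep probabilities strictly inside (0, 1) for inv_cdf
    return [math.exp(dist.inv_cdf(min(max(p, eps), 1 - eps)))
            for p in latin_hypercube(n, rng)]


def simulate(n=10_000, seed=1):
    """Propagate two uncertain inputs through a toy output equation."""
    rng = random.Random(seed)
    electricity = sample_lognormal(50.0, 1.1, n, rng)      # MJ, hypothetical
    emission_factor = sample_lognormal(0.15, 1.5, n, rng)  # kg/MJ, hypothetical
    return sorted(e * f for e, f in zip(electricity, emission_factor))


results = simulate()
p05, p50, p95 = (results[int(q * len(results))] for q in (0.05, 0.50, 0.95))
```

The sorted results form the frequency distribution of the output; the percentiles `p05`, `p50` and `p95` summarise its spread. Shuffling each stratified sample independently pairs the parameter values at random across iterations.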


As LCAs are in most cases concerned with relative differences
(between product systems or between improvement
options), it is vital to take into account the interdependency
between the product systems under consideration. For
instance, environmental interventions due to the same
background processes, such as electricity production, should be
varied simultaneously for all the product systems under study
(Huijbregts 1998b).
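
The effect of varying a shared background process simultaneously can be shown with a small sketch. Two hypothetical product systems use the same electricity supply; in one run the emission factor is sampled once per iteration and applied to both systems, in the other it is sampled independently for each system. All quantities and names are invented for illustration.

```python
import math
import random


def fraction_b_exceeds_a(shared, n=2000, seed=7):
    """Fraction of iterations in which system B emits more than system A.

    System A uses 10 MJ and system B uses 12 MJ of the same electricity.
    shared=True applies one sampled emission factor to both systems;
    shared=False samples the factor separately for each system.
    """
    rng = random.Random(seed)
    count = 0
    for _ in range(n):
        ef_a = rng.lognormvariate(math.log(0.15), 0.5)  # kg per MJ
        ef_b = ef_a if shared else rng.lognormvariate(math.log(0.15), 0.5)
        emission_a = 10.0 * ef_a
        emission_b = 12.0 * ef_b
        count += emission_b > emission_a
    return count / n


dependent = fraction_b_exceeds_a(shared=True)
independent = fraction_b_exceeds_a(shared=False)
```

With shared sampling, system B exceeds system A in every iteration, as it must when both rely on the same electricity supply. Independent sampling blurs this difference and would understate the significance of the comparison.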

Probabilistic uncertainty analysis is useful to make the
influence of input data uncertainty on the uncertainty of the
model outcomes operational. For the reduction of parameter
uncertainty, however, more reliable data must be provided
by additional literature research, expert judgement or,
ideally, measurements. Parameters which cause the largest
spread in the model outcome should get priority. The
contribution of the separate parameters to the total uncertainty
may be estimated through use of statistical regression
techniques (Janssen et al. 1990).
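
One simple regression-based estimate of these contributions is sketched below; it is not necessarily the method of Janssen et al. (1990). For each input, the squared correlation coefficient between the sampled input values and the simulated output equals the R² of a one-variable linear regression, and for a roughly linear model with independent inputs it approximates the share of output variance due to that input. The model, distributions and names are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def contributions(inputs, output):
    """Squared correlation of each input with the output, i.e. the R^2
    of regressing the output on that input alone."""
    return {name: pearson(values, output) ** 2
            for name, values in inputs.items()}


rng = random.Random(3)
n = 5000
inputs = {
    "ef_electricity": [rng.gauss(0.15, 0.03) for _ in range(n)],  # kg per MJ
    "ef_transport": [rng.gauss(0.10, 0.01) for _ in range(n)],    # kg per t km
}
# Toy output equation: 50 MJ of electricity and 20 t km of transport.
output = [50.0 * e + 20.0 * t
          for e, t in zip(inputs["ef_electricity"], inputs["ef_transport"])]

shares = contributions(inputs, output)
```

In this toy case nearly all of the output variance is attributed to the electricity emission factor, which would therefore get priority in data refinement.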

Finally, it should be noted that, although the techniques
mentioned above may be useful in dealing with aspects of
data inaccuracies in LCA, there is some potential for their
misuse. One risk is that, because the techniques provide
technical-looking results, they may be inappropriately used as a
low-cost substitute for actual scientific research (Morgan et
al. 1985). Nevertheless, through LCA peer review and other
quality control measures, together with accumulating
experience and understanding about uncertainty analysis among
LCA practitioners, the danger of misusing these techniques
can be minimized.

3 Conclusion 

Dealing with data uncertainty should be an integral part of
every LCA, as it may supply vital information for decision
makers in judging the significance of the differences in
product comparisons, options for product improvements or the
assignment of ecolabels. However, modelling data uncertainty
is still not common practice in life cycle inventories
(LCI). To enhance the use of data uncertainty assessments
in LCI practice, options to deal with lack of data,
unrepresentative data and data inaccuracy are discussed. Filling data
gaps by input-output modelling, applying uncertainty factors
to non-representative data, and using stochastic modelling
to deal with data inaccuracy are identified as potentially
promising options to deal with the various types of
data uncertainty. It should be stressed, however, that without
empirical justification of uncertainty factors and uncertainty
ranges applied to LCI data, the relevance of the
uncertainty analysis will be limited. Further research towards
clarifying uncertainty estimates in LCI data would be very
valuable in this respect.

Finally, to keep the actual performance of LCA case studies
feasible, it is highly recommended to implement in current
LCA software the various tools dealing with data uncertainty.